Mon. Not. R. Astron. Soc. 000, 1-?? (2001) Printed 2 February 2008 (MN LaTeX style file v1.4)



Goodness-of-Fit Tests to study the Gaussianity of the 
MAXIMA data 



L. Cayón^{1,2}, F. Argüeso^{3}, E. Martínez-González^{1} and J.L. Sanz^{1}

1. Instituto de Física de Cantabria, Fac. Ciencias, Av. los Castros s/n, 39005 Santander, Spain

2. Physics Department, Purdue University, 525 Northwestern Avenue, West Lafayette, IN 47907-2036, USA

3. Dpto. de Matemáticas, Universidad de Oviedo, c/ Calvo Sotelo s/n, 33007 Oviedo, Spain






ABSTRACT 

Goodness-of-Fit tests, including Smooth ones, are introduced and applied to detect
non-Gaussianity in Cosmic Microwave Background simulations. We study the power
of three different tests: the Shapiro-Francia test (1972), the uncategorised smooth
test developed by Rayner & Best (1990) and Neyman's Smooth Goodness-of-Fit
test for composite hypotheses (Thomas & Pierce 1979). The Smooth Goodness-of-Fit
tests are designed to be sensitive to the presence of "smooth" deviations from
a given distribution. We study the power of these tests based on the discrimination
between Gaussian and non-Gaussian simulations. Non-Gaussian cases are simulated
using the Edgeworth expansion and assuming pixel-to-pixel independence. Results
show that these tests behave similarly and are more powerful than tests directly based
on cumulants of order 3, 4, 5 and 6. We have applied these tests to the released
MAXIMA data. The applied tests are built to be powerful at detecting deviations
from univariate Gaussianity. The Cholesky matrix corresponding to signal (based on
an assumed cosmological model) plus noise is used to decorrelate the observations
prior to the analysis. Results indicate that the MAXIMA data are compatible
with Gaussianity.

Key words: Cosmic Microwave Background. Methods: data analysis 






1 INTRODUCTION 

The detection of non-Gaussianity in Cosmic Microwave Background (CMB) maps
would question the validity of Standard Inflationary theories. These theories
assume the existence of a single scalar field, as well as linear theory, to
generate the cosmological perturbations that later develop into the structures
observed in the Universe. Alternative scenarios include the presence of
topological defects (Durrer 1999) or isocurvature fluctuations (Peebles
1999a,b), multi-field inflation models (Bernardeau & Uzan 2002 and references
therein) and stochastic inflationary scenarios generating features in the
inflaton potential (Starobinsky 1986). Moreover, a recent work by Acquaviva
et al. (2002) shows how the inclusion of secondary effects modifies the
predictions of single scalar field theories.

All the above alternatives to Standard Inflation result in non-Gaussian CMB
temperature fluctuations. The type and amount of non-Gaussianity to be observed
in CMB maps is under study at the moment. Several works have explored the
implications for CMB observations of different physical mechanisms that
generate non-Gaussianity (Komatsu & Spergel 2001, Landriau & Shellard 2002,
Acquaviva et al. 2002, Gupta et al. 2002, Gangui et al. 1994). Tests of a
scenario including a quadratic term in the gravitational potential have been
performed on COBE-DMR CMB data (Komatsu et al. 2002, Cayón et al. 2002). The
poor resolution of these data does not provide a very good constraint on the
non-linear coupling parameter accounting for the contribution of the quadratic
term. However, it can be concluded that the method based on the Spherical
Mexican Hat wavelet provides a better constraint than the one based on the
bispectrum.

At present there is a large effort to implement different statistical tools
that will allow us in the future to test the Gaussianity of observed CMB data.
The power of the different methods will vary depending on the type of
non-Gaussianity present in the data. In this paper we propose three
goodness-of-fit tests and study their power on simulations. We simulate
Gaussian and non-Gaussian maps. The latter are generated using the Edgeworth
expansion. This expansion was used for the first time to simulate non-Gaussian
CMB maps by Martínez-González et al. (2002). The proposed

Figure 1. Distribution of the Shapiro-Francia statistic obtained
from 50000 independent Gaussian simulations of maps with 2164
pixels (independent pixel-to-pixel).



is very close to one. Deviations from Gaussianity will re- 
sult in values smaller than one. The distribution of the SF 
statistic for independent Gaussian realizations is presented 
in Figure 1. 
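
As an illustration of how the Shapiro-Francia statistic of Section 2 could be evaluated, here is a minimal Python sketch. It makes several assumptions not stated in the text: the expected sorted values c are estimated by averaging sorted N(0,1) samples over `n_sims` Monte Carlo realizations (an illustrative number), b is c divided by its Euclidean norm, and the dispersion uses the 1/N normalization. The function names are hypothetical and only the standard library is used.

```python
import random


def expected_order_statistics(n, n_sims=200, seed=0):
    """Monte Carlo estimate of the expected sorted values of n N(0,1) draws."""
    rng = random.Random(seed)
    c = [0.0] * n
    for _ in range(n_sims):
        sample = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
        for i, value in enumerate(sample):
            c[i] += value / n_sims
    return c


def shapiro_francia(data, c):
    """SF = (sum_i b_i x_i)^2 / (N sigma^2) for sorted x, with b = c / |c|."""
    norm = sum(ci * ci for ci in c) ** 0.5
    b = [ci / norm for ci in c]
    x = sorted(data)
    mean = sum(x) / len(x)
    n_sigma2 = sum((xi - mean) ** 2 for xi in x)  # N times the variance
    return sum(bi * xi for bi, xi in zip(b, x)) ** 2 / n_sigma2
```

With this sketch, Gaussian samples give values close to one, while a skewed sample (for instance exponential draws) gives a noticeably smaller value.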

Smooth goodness-of-fit methods have been constructed as powerful tests against
distributions that might deviate "smoothly" from the normal $N(0,1)$ one. These
methods are sensitive to the presence of skewness ($S$), kurtosis ($K$) and
higher order moments. Some of them are defined based on orthonormal functions
(Rayner & Best 1990) whereas others make use of powers of the distribution
function (Thomas & Pierce 1979). The uncategorised smooth statistics $S_k$
proposed by Rayner & Best (1990) make use of the Hermite-Chebyshev polynomials
$P_i$ and are defined by

$$S_k = \sum_{i=1}^{k} U_i^2, \qquad U_i = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} h_i(x_j),$$

where $h_i(x_j) = P_i(x_j)/\sqrt{i!}$. The statistic $S_k$ is related to
cumulants of order $\le k$, such that for example:

$$S_1 = N\langle x\rangle^2, \qquad S_2 = S_1 + (N/2)\left(\langle x^2\rangle - 1\right)^2,$$



methods are especially well suited to analyse data covering a region of the sky
and not necessarily taken on a regular grid. We have applied them to the
recently released MAXIMA data (Balbi et al. 2000, Hanany et al. 2000). The
MAXIMA data have already been tested against Gaussianity by Wu et al. (2001)
and Santos et al. (2002a,b). Both works conclude that the data are compatible
with Gaussianity.

The paper is organized as follows. Section 2 presents the goodness-of-fit
tests and tests them on non-Gaussian simulations generated using the Edgeworth
expansion. An application to the MAXIMA data is presented in Section 3.
Discussion and conclusions are included in Section 4.



2 GOODNESS-OF-FIT STATISTICS 

Given a sample of uncorrelated and normalized (zero mean, dispersion one) CMB
data, the question we want to answer is "how well do the data agree with the
population of a Gaussian distribution $N(0,1)$?" (the methods used here are
also suited for testing a composite hypothesis where mean and dispersion are
not specified).

Many goodness-of-fit methods have been developed to test normality (for a
review see D'Agostino & Stephens 1986). Out of all of them we have chosen to
apply the Shapiro-Francia one (Shapiro & Francia 1972), a modification of the
Shapiro-Wilk test for large data sets. Implementation of the Shapiro-Francia
test requires the following steps:

1) Estimation of the one-dimensional array $c$ corresponding to the expected
sorted values obtained from independent Gaussian simulations $N(0,1)$. We
define $b = c/(c^{T}c)^{1/2}$.

2) For a given sorted data set $x$, the Shapiro-Francia statistic SF is
defined as follows

$$SF = \frac{\left(\sum_{i=1}^{N} b_i x_i\right)^2}{N\sigma^2},$$

where $N$ is the number of data and $\sigma$ is the dispersion. The
expected value of this statistic for the Gaussian distribution



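$$S_3 = S_2 + (N/6)\left(\langle x^3\rangle - 3\langle x\rangle\right)^2,$$

$$S_4 = S_3 + (N/24)\left(\langle x^4\rangle - 6\langle x^2\rangle + 3\right)^2,$$

where $\langle\,\rangle$ denotes the average. If $\langle x\rangle = 0$ and
$\langle x^2\rangle = 1$, then $S_3 = (N/6)S^2$ and $S_4 = S_3 + (N/24)K^2$.
The $S_k$ statistic is distributed as a $\chi^2_k$ for the Gaussian case.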
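
The relation between $S_k$ and the sample moments can be checked numerically. The sketch below is only an illustration: it assumes that the Hermite-Chebyshev polynomials are the probabilists' Hermite polynomials He_i normalized by sqrt(i!), which is consistent with the expressions quoted for S_1 to S_4, and the function names are hypothetical.

```python
import math


def hermite_he(order, x):
    """Probabilists' Hermite polynomial He_order(x).

    Uses the recurrence He_{n+1}(x) = x He_n(x) - n He_{n-1}(x).
    """
    previous, current = 1.0, x
    if order == 0:
        return previous
    for n in range(1, order):
        previous, current = current, x * current - n * previous
    return current


def smooth_statistic(data, k):
    """S_k = sum_{i=1..k} U_i^2, U_i = N^{-1/2} sum_j He_i(x_j) / sqrt(i!)."""
    n = len(data)
    total = 0.0
    for i in range(1, k + 1):
        norm = math.sqrt(n * math.factorial(i))
        u = sum(hermite_he(i, x) for x in data) / norm
        total += u * u
    return total
```

For data already renormalized to zero mean and unit variance, the first two terms vanish, so `smooth_statistic(data, 3)` reduces to (N/6) times the squared sample skewness.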
We also make use in this paper of the smooth goodness- 
of-fit test Wk proposed by Thomas & Pierce (1979) as a 
modification of Neyman's one. This test is built on powers 
of the normal distribution function and compares sample 
means of these quantities with the expected values under 
the null hypothesis that the data correspond to a population 
sample of the normal distribution. 

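$$W_k = \sum_{i=1}^{k} \left[\sum_{j=1}^{i} a_{ij} t_j\right]^2, \qquad k = 1, 2, 3, 4, \ldots$$

$$t_j = \frac{1}{\sqrt{N}} \sum_{r=1}^{N} \left(y^j(x_r) - \frac{1}{j+1}\right),$$

where for the normal distribution $y(x_r) = \Phi\left(\frac{x_r - \mu}{\sigma}\right)$ is the
cumulative distribution function, $\mu$ and $\sigma$ being the mean and dispersion, and the
coefficients $a_{ij}$ are given in Table 3 of Thomas & Pierce (1979) (e.g. $a_{11} = 16.3172$,
$a_{21} = -a_{22} = -27.3809$). Therefore, for $k = 1, 2$, for example, the statistics are

$$W_1 = \left[\frac{16.3172}{\sqrt{N}} \sum_{r=1}^{N} \left(y(x_r) - \frac{1}{2}\right)\right]^2,$$

$$W_2 = W_1 + \left[\frac{27.3809}{\sqrt{N}} \sum_{r=1}^{N} \left(y^2(x_r) - \frac{1}{3} - y(x_r) + \frac{1}{2}\right)\right]^2.$$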
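
The first two $W_k$ statistics can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the mean and dispersion are estimated from the sample with the 1/N normalization, y is taken to be the normal cumulative distribution function, and only the two coefficients quoted in the text are used. The function name is hypothetical.

```python
import math
from statistics import NormalDist

A11 = 16.3172  # a_11, as quoted from Thomas & Pierce (1979)
A22 = 27.3809  # a_22 = -a_21, as quoted from Thomas & Pierce (1979)


def w_statistics(data):
    """Return (W_1, W_2), estimating mean and dispersion from the sample."""
    n = len(data)
    mean = sum(data) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    cdf = NormalDist().cdf
    y = [cdf((x - mean) / sigma) for x in data]
    # t_j = N^{-1/2} sum_r (y^j(x_r) - 1/(j+1))
    t1 = sum(yr - 1.0 / 2.0 for yr in y) / math.sqrt(n)
    t2 = sum(yr ** 2 - 1.0 / 3.0 for yr in y) / math.sqrt(n)
    w1 = (A11 * t1) ** 2
    w2 = w1 + (A22 * (t2 - t1)) ** 2  # a_21 t_1 + a_22 t_2 = a_22 (t_2 - t_1)
    return w1, w2
```

Higher orders would need the remaining coefficients from Table 3 of Thomas & Pierce (1979), which are not reproduced here.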

These statistics, under the null hypothesis, are distributed as $\chi^2_k$. The
distributions of values of the two smooth goodness-of-fit statistics for 50000
independent Gaussian realizations are presented in Figures 2 and 3. In both
cases, and in the examples and applications presented below, the data are
renormalized to zero mean and unit variance before the tests are applied.
Because of that the $S_k$ statistics appear as $\chi^2_{k-2}$ distributions.
The distributions of the two statistics converge to

Figure 2. Distribution of $S_k$ smooth goodness-of-fit statistics. From left to right, top to bottom, $S_k$ corresponding to $S_3$, $S_4$, $S_5$, $S_6$.
Distributions obtained from 50000 independent Gaussian simulations of maps with 2164 pixels (pixel-to-pixel independent).



the expected ones. However, one can see that the $W_k$ statistic
converges faster to the expected distribution than the $S_k$ one.

As an example of how well these methods work on separating Gaussian from
non-Gaussian data, we have applied them to simulated non-Gaussian data
following distributions as in Martínez-González et al. (2002). An array with
2164 independent pixels constitutes a simulation. The chosen number of pixels
matches the number of central pixels in the MAXIMA map later selected for
analysis. The pixel value is drawn from a non-Gaussian distribution obtained
through the Edgeworth expansion characterised by a given skewness and
kurtosis. Given an input skewness and kurtosis, the mean and dispersion of
these two statistics in the resulting simulations are given in Table 1, as
well as the power of the skewness and kurtosis to discriminate between
Gaussian and non-Gaussian distributions. 10000 simulations were performed, and
the power of the goodness-of-fit statistics presented in this paper is given
for several input skewness and kurtosis values in Table 2 and Table 3. As one
can see from these results, most of the presented goodness-of-fit statistics
have more power than the directly calculated cumulants. The $W_2$ statistic is
the one with the highest discriminating power in most of the cases. Even so,
the result is very dependent on the underlying distribution.
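
To make the simulation procedure concrete, here is a hedged Python sketch of drawing independent pixel values from an Edgeworth-type density. It is an assumption-laden illustration: the density is taken to be the Gaussian multiplied by 1 + (S/6) He_3(x) + (K/24) He_4(x), negative values are clipped to zero, and sampling is done by inverse-CDF lookup on a grid. The exact expansion and sampling scheme used for the tables may differ (for example in higher-order terms); `x_max`, `n_grid` and the function names are illustrative.

```python
import bisect
import math
import random


def edgeworth_pdf(x, skew, kurt):
    """Gaussian density times 1 + (S/6) He_3 + (K/24) He_4, clipped at zero."""
    he3 = x ** 3 - 3.0 * x
    he4 = x ** 4 - 6.0 * x ** 2 + 3.0
    gauss = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return max(0.0, gauss * (1.0 + skew * he3 / 6.0 + kurt * he4 / 24.0))


def make_sampler(skew, kurt, x_max=6.0, n_grid=2401, seed=0):
    """Return a function that draws one pixel value per call."""
    rng = random.Random(seed)
    step = 2.0 * x_max / (n_grid - 1)
    grid = [-x_max + i * step for i in range(n_grid)]
    pdf = [edgeworth_pdf(x, skew, kurt) for x in grid]
    cdf = [0.0]
    for left, right in zip(pdf[:-1], pdf[1:]):  # trapezoidal rule
        cdf.append(cdf[-1] + 0.5 * step * (left + right))
    cdf = [value / cdf[-1] for value in cdf]  # renormalize after clipping

    def draw():
        u = rng.random()
        i = min(max(bisect.bisect_left(cdf, u), 1), n_grid - 1)
        width = cdf[i] - cdf[i - 1]
        frac = 0.0 if width == 0.0 else (u - cdf[i - 1]) / width
        return grid[i - 1] + frac * step  # linear interpolation in the cell

    return draw
```

A simulated "map" in the sense of the text would then be `[draw() for _ in range(2164)]`, renormalized to zero mean and unit variance before applying the tests.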

The cases simulated above are based on an Edgeworth expansion considering all
cumulants of order greater than 2 equal to zero, except for those of order 3
and 4. The smooth goodness-of-fit statistics combine information from
cumulants of any order and in particular of orders below 7. We have performed
simulations based on the Edgeworth expansion including non-null cumulants up
to order 6. The simulated maps do not preserve the input cumulant values. Mean
and dispersion values of the cumulants calculated out of 10000 simulations are
presented in Table 4. The power of the cumulants to differentiate between a
Gaussian and another distribution (based on the Edgeworth expansion) is given
at the 95% confidence level in Table 5.
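
The power figures quoted in the tables can be understood as the fraction of non-Gaussian simulations whose statistic falls beyond a threshold set by the Gaussian simulations. The following Python sketch shows one way to compute it; the use of a one-sided empirical quantile, and of the lower tail for statistics such as SF, is an assumption, and the function name is hypothetical.

```python
def power_at_confidence(gaussian_values, non_gaussian_values,
                        confidence=0.95, lower_tail=False):
    """Percentage of non-Gaussian simulations rejected at the given level.

    The threshold is the empirical quantile of the Gaussian simulations.
    Use lower_tail=True for statistics that decrease under non-Gaussianity
    (such as SF).
    """
    ordered = sorted(gaussian_values)
    n = len(ordered)
    if lower_tail:
        threshold = ordered[int((1.0 - confidence) * n)]
        rejected = sum(value < threshold for value in non_gaussian_values)
    else:
        threshold = ordered[min(n - 1, int(confidence * n))]
        rejected = sum(value > threshold for value in non_gaussian_values)
    return 100.0 * rejected / len(non_gaussian_values)
```

For a signed cumulant such as the skewness, one option is to pass absolute values so that deviations of either sign count as detections.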

The smooth goodness-of-fit statistics as well as the Shapiro-Francia test have
been calculated for the simulated maps including cumulants up to order 6.
Powers at the 95% confidence level are presented in Table 6. As one can see,
in all cases the Shapiro-Francia test is the most powerful of all

Figure 3. Distribution of $W_k$ smooth goodness-of-fit statistics. From left to right, top to bottom, $W_k$ corresponding to $W_1$, $W_2$, $W_3$, $W_4$.
Distributions obtained from 50000 independent Gaussian simulations of maps with 2164 pixels (pixel-to-pixel independent).



Table 1. Average and dispersion for the skewness and kurtosis values obtained from 10000 simulations drawn from Edgeworth
expansions assuming skewness and kurtosis values denoted by S&K (input). The power of these two statistics is also given in
columns 4 and 5.

S&K(input)  Mean/Disp (S)    Mean/Disp (K)    Power(95/99%) (S)  Power(95/99%) (K)

0.0&0.4     5.95e-4/0.0607   0.3170/0.1205    7.86/2.13          87.85/65.57

0.1&0.0     0.0965/0.0503    -0.0342/0.0938   57.51/28.36        1.65/0.19

0.1&0.3     0.0982/0.0581    0.2309/0.1157    57.49/32.25        67.26/37.05

0.1&0.4     0.0968/0.0604    0.3180/0.1239    56.22/31.6         87.72/65.19

0.1&0.5     0.0976/0.0624    0.4061/0.1273    56.46/32.67        97.08/86.75

0.1&0.7     0.0976/0.0624    0.5820/0.1391    57.27/35.62        99.96/99.17

0.3&0.3     0.2944/0.0565    0.2323/0.1320    99.96/99.86        64.90/38.87


the implemented tests. It is difficult to assess whether some tests will be
more convenient than others when trying to detect non-Gaussianity. Moreover,
as pointed out by Bromley & Tegmark (1999), even if non-Gaussianity is
detected by one statistical method, the confidence level has to be established
taking into account all the methods applied.



3 MAXIMA DATA ANALYSIS. RESULTS 

The goodness-of-fit tests previously described are optimal
for testing univariate Gaussian distributions. We therefore
first of all transform the MAXIMA data by multiplying it
by the inverse of the Cholesky matrix corresponding to sig-
nal plus noise. We first calculate the Cholesky decomposi-

Table 2. Power at 95% and 99% confidence level for the different statistics presented in this work (notation used for the table
"95%/99%"). These results are based on 10000 Gaussian and non Gaussian simulations. The non Gaussian ones were obtained
from the Edgeworth expansion for different values of skewness and kurtosis (skewness S and kurtosis K input values indicated in
column 1).

S&K      S3           S4           S5           S6           W1           W2           W3           W4

0.0&0.4  8.95/2.46    74.04/53.18  66.48/36.26  71.98/23.32  6.80/1.67    83.03/64.75  77.20/58.76  74.38/52.97

0.1&0.0  44.62/20.51  33.4/12.93   23.63/5.11   16.7/0.65    41.13/20.26  33.22/14.31  29.16/12.38  25.61/9.11

0.1&0.3  46.52/25.03  68.46/47.10  62.03/31.30  61.27/15.97  45.99/25.31  74.22/53.98  68.43/48.04  65.38/42.2

0.1&0.4  45.58/24.99  84.35/67.6   79.38/52.71  81.91/35.99  46.35/25.42  90.79/77.35  87.24/72.32  84.81/66.26

0.1&0.5  46.35/26.03  94.56/86.13  92.80/74.72  95.00/62.99  47.64/27.23  98.27/93.92  97.04/91.54  96.23/88.04

0.1&0.7  47.96/28.82  99.75/98.92  99.56/97.05  99.91/97.06  50.88/31.07  99.99/99.95  99.97/99.89  99.96/99.75

0.3&0.3  99.95/90.75  99.90/99.51  00.SO/09.58  90.80/02.81  00.00/00,85  00.05/00.72  00.02/00.10  90.88/09.00



Table 3. Power at 95% and 99% confidence level for the Shapiro-Francia statistic (notation used for the table "95%/99%"). These
results are based on 10000 Gaussian and non Gaussian simulations. The non Gaussian ones were obtained from the Edgeworth
expansion for different values of skewness and kurtosis (skewness S and kurtosis K input values indicated in column 1).

S&K      SF

0.0&0.4  70.82/47.30

0.1&0.0  30.97/13.69

0.1&0.3  65.21/42.45

0.1&0.4  83.37/64.62

0.1&0.5  95.35/86.27

0.1&0.7  99.90/99.56

0.3&0.3  99.94/99.66



Table 4. Average and dispersion for the skewness, kurtosis, 5th and 6th order cumulants obtained from 10000 simulations drawn
from Edgeworth expansions assuming skewness, kurtosis, 5th and 6th order cumulant values denoted by S&K&k5&k6 (input).

S&K&k5&k6 (input)  Mean/Disp (S)   Mean/Disp (K)   Mean/Disp (k5)  Mean/Disp (k6)

0.1&0.4&1.0&3.0    0.0955/0.0673   0.3169/0.1540   0.8156/0.3246   1.3604/0.67157

0.1&0.3&1.0&3.0    0.1264/0.0617   0.1475/0.1421   1.0634/0.2610   1.2203/0.5402

0.1&0.4&0.8&3.0    0.0937/0.0677   0.3160/0.1563   0.6076/0.3450   1.4349/0.6925

0.1&0.3&0.8&3.0    0.1397/0.0602   0.1123/0.1417   0.9523/0.2714   1.2030/0.5468



Table 5. Power of the skewness, kurtosis, 5th and 6th order cumulants at the 95% confidence level. Results drawn from
10000 simulations based on Edgeworth expansions assuming skewness, kurtosis, 5th and 6th order cumulant values denoted by
S&K&k5&k6 (input).

S&K&k5&k6 (input)  Power (S)  Power (K)  Power (k5)  Power (k6)

0.1&0.4&1.0&3.0    54.91      81.48      91.09       73.10

0.1&0.3&1.0&3.0    73.52      41.46      99.72       69.97

0.1&0.4&0.8&3.0    53.94      81.27      75.17       76.32

0.1&0.3&0.8&3.0    80.72      32.17      98.86       68.17



Table 6. Power at 95% confidence level for the different statistics presented in this work. These results are based on 10000 Gaussian
and non Gaussian simulations. The non Gaussian ones were obtained from the Edgeworth expansion for different values of skewness,
kurtosis, 5th and 6th order cumulants (input values for skewness S, kurtosis K, 5th k5 and 6th k6 order cumulants indicated in column 1).

S&K&k5&k6        S3     S4     S5     S6     W1     W2     W3     W4     SF

0.1&0.4&1.0&3.0  45.92  78.11  93.63  97.54  9.61   17.04  37.68  61.16  99.91

0.1&0.3&1.0&3.0  64.77  60.28  98.87  98.98  13.30  14.18  37.29  41.48  99.98

0.1&0.4&0.8&3.0  44.85  78.46  87.77  94.78  12.96  20.30  35.95  52.81  99.36

0.1&0.3&0.8&3.0  72.82  64.30  97.58  97.54  22.13  21.61  46.99  44.46  99.88

tion corresponding to the "data correlation matrix". To do that, a
cosmological model fitting the data has been assumed. The MAXIMA data are best
fitted by a cosmological model characterised by $\Omega_b = 0.105$,
$\Omega_c = 0.595$, $\Omega_\Lambda = 0.3$ and $h = 0.53$ (Balbi et al. 2000).
Simulations of this model are used to calculate the signal correlation matrix.
The so-called "data correlation matrix" is the sum of the signal and noise
correlation matrices.
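
The decorrelation step can be sketched as follows. This is a minimal pure-Python illustration for small matrices, assuming the data correlation matrix C (signal plus noise) is symmetric and positive definite; the function names are hypothetical. Rather than forming the inverse explicitly, the sketch solves the triangular system L z = d, which is equivalent to multiplying the data by the inverse of the Cholesky matrix.

```python
import math


def cholesky(matrix):
    """Lower-triangular L with L L^T = matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def decorrelate(lower, data):
    """Solve L z = data by forward substitution, i.e. z = L^{-1} data."""
    z = []
    for i, row in enumerate(lower):
        z.append((data[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z


def correlate(lower, white):
    """Return L white, turning independent N(0,1) draws into correlated ones."""
    return [sum(row[k] * white[k] for k in range(i + 1))
            for i, row in enumerate(lower)]
```

If the data have correlation matrix C = L L^T, the output of `decorrelate` has unit correlation matrix; `correlate` is the reverse operation, which is how correlated noise realizations can be generated from independent Gaussian values.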

Once the data correspond to independent values drawn from a normal
distribution with mean zero and dispersion one, $N(0,1)$, we calculate the
smooth goodness-of-fit statistics introduced above as well as the
Shapiro-Francia one. Noise levels are especially high at some pixels,
resulting in large correlation values off the diagonal. Most of these pixels
appear to be at the border of the observed region. To make sure the previous
test is not dominated by noise we have selected pixels in the central area of
the region covered by the MAXIMA observations. We select pixels with right
ascension in the range 226.47 to 238.24 deg and declination ranging from
55.567 deg to 61.7 deg. These pixels amount to a total of 2164. In order to
see if the MAXIMA data are compatible with Gaussianity we have simulated 50000
maps with 2164 pixels independently generated from a $N(0,1)$ distribution.
The statistic values obtained from the data, as well as the probability of
having Gaussian values larger than the MAXIMA ones, are presented in Table 7.
One should keep in mind that in the case of the Shapiro-Francia test this is a
confidence level taken from the left. As one can see, the MAXIMA data are
compatible with Gaussianity.
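
The probabilities quoted for the data can be read as simple counts over the Gaussian simulations. A minimal Python sketch, with a hypothetical function name and the assumption that ties are counted as "at least as extreme", is:

```python
def probability_percent(simulated_values, data_value, from_left=False):
    """Percentage of simulations at least as extreme as the data value.

    By default counts simulations with value >= data_value; with
    from_left=True (as for SF) counts those with value <= data_value.
    """
    if from_left:
        count = sum(value <= data_value for value in simulated_values)
    else:
        count = sum(value >= data_value for value in simulated_values)
    return 100.0 * count / len(simulated_values)
```

Here `simulated_values` would hold one statistic value per simulated map, and `data_value` the corresponding value measured on the decorrelated data.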

The process we follow to decorrelate the observations could be thought to
introduce some artifacts that make the comparison with independent $N(0,1)$
simulations inappropriate. To address this question we simulate data taking
into account the MAXIMA observational constraints as well as the noise
information. Afterwards these simulations are decorrelated following the same
steps as in the real data case. The statistical tests are applied to these
decorrelated simulations and compared with the results obtained from the
decorrelated MAXIMA data.

As a first step we "tried" simulating CMB skies as seen by MAXIMA. Those
simulations included signal plus noise. The quoted word refers to the signal
simulations. To simplify the simulation process we have based the simulations
on the HEALPix pixelization (Górski, Hivon & Wandelt 1999). This does not
exactly reproduce the observational grid but we consider the approach good
enough for the proposed test. These maps are simulated following these steps:

1) The $a_{\ell m}$ coefficients are generated assuming the power spectrum
corresponding to the cosmological model that best fits the MAXIMA data (Balbi
et al. 2000) multiplied by the beam pattern.

2) We use the HEALPix subroutines to generate a map with the obtained
$a_{\ell m}$s. The maximum $\ell$ corresponds to nside = 512, that is, the
generated pixels are 7 x 7 arcmin$^2$. MAXIMA pixels are 8 x 8 arcmin$^2$.
Moreover, the MAXIMA pixelization does not agree with the HEALPix
pixelization. The simulated pixel values are assigned to MAXIMA pixels based
on their right ascension and declination. Since the processing time is quite
long we only performed 300 simulations. Noise simulations are obtained by
multiplying a pixel-to-pixel independently generated (from a Gaussian
distribution $N(0,1)$) map by the Cholesky matrix corresponding to the noise
correlation matrix.

The simulated maps are afterwards multiplied by the inverse of the Cholesky
matrix, as is done with the data. Results for the different statistics
evaluated on the MAXIMA data and the probability of these values being drawn
from a Gaussian distribution are presented in Table 8. The MAXIMA values are
in agreement with Gaussian ones, as was obtained in the previous case. The
probability values obtained for the different statistics in Tables 7 and 8 are
of the same order. One can however notice a larger difference in the case of
the $K$ statistic. We have checked the $K$ values obtained in both cases (for
Tables 7 and 8). We have done the exercise of obtaining the distribution of
$|K|$ values directly from the $K$ ones (case 1) and by combining the $S_3$
and $S_4$ values as indicated in Section 2 (case 2). In case 1, the MAXIMA
value is $|K_{MAXIMA}| = 0.058$ (as indicated in Table 7) and the probability
of getting values greater than or equal to this one is 58.15% for simulations
in Table 7 and 65.00% for simulations in Table 8. In case 2, the MAXIMA value
is $|K_{MAXIMA}| = 0.055$ and the probability of getting values greater than
or equal to this one is 60.18% for simulations in Table 7 and 65.00% for
simulations in Table 8. There is therefore compatibility between these two
cases. Moreover, the distribution of the absolute value of the $K$ statistic
seems not to change much between simulations in Table 7 and those in Table 8.
What might be happening is that the distribution of $K$ values converges
slowly and therefore 300 simulations, in the case of those in Table 8, might
not be enough to assess a precise probability value. Nevertheless this result
does not affect our final conclusion, that the MAXIMA data are found to be
compatible with Gaussianity under the goodness-of-fit tests applied in this
work.

Finally, we would like to note the importance of decorrelating the data in
order to assess their departure from Gaussianity by looking at the
goodness-of-fit statistics introduced in this paper. As already mentioned,
these are statistics designed to test univariate Gaussianity. Applying these
methods to correlated data will only have partial meaning and their power will
be diluted. We should however mention the fact that the decorrelation
procedure will modify the distribution of the data we are analysing. It is
difficult to quantify this effect as it would depend on the deviations from
Gaussianity present in the analysed data as well as on the corresponding
Cholesky matrix. Just as an example we have applied all the statistical tests
discussed in this work to two cases in which a certain amount of
non-Gaussianity is introduced in correlated simulations. The results are
presented in Table 9. The power of the statistics in distinguishing Gaussian
from non-Gaussian simulations is shown in columns 2 and 3 for the two cases
considered. In each column, the first number represents the power at the 95%
c.l. when looking at correlated simulations. The second number indicates the
power at the 95% c.l. after decorrelating the simulations (the inverse of the
Cholesky matrix corresponding to MAXIMA data is used for this). The Gaussian
simulations are done as explained in the fourth paragraph of this Section,
following MAXIMA's constraints. Each of the non-Gaussian ones consists of the
sum of a Gaussian simulation plus a non-Gaussian one with the same correlation
(the fact that we have a sum of two simulations with the same correlation is

Table 7. Statistic values obtained for the MAXIMA data and probability of being drawn from a Gaussian distribution (Prob
>= MAXIMA) from independent Gaussian simulations.

statistic  data value  Prob%

S          0.035       25.23

K          0.058       28.04

k5         -0.276      89.43

k6         -0.068      48.32

S3         0.446       50.29

S4         0.717       69.74

S5         2.091       52.34

S6         2.101       64.37

W1         0.948       33.23

W2         1.050       59.64

W3         1.170       76.29

W4         1.177       88.40

SF         0.9994      30.06



taken into account when decorrelating). The non-Gaussian simulations are
generated based on an Edgeworth expansion like the ones in Section 2, and
afterwards multiplied by the Cholesky matrix corresponding to the MAXIMA data.
As can be seen from the Table, all of the suggested methods have a larger
power after decorrelating, even if the distributions have been modified in the
process.



4 DISCUSSION AND CONCLUSIONS 

The future detection of non-Gaussianity in CMB data will rely on the power of
the applied statistical methods. At present, it is not an easy task to
establish which methods will be more powerful. Physically motivated deviations
from Gaussianity generated in theories alternative to the Standard
Inflationary one are still not well characterised. It is nevertheless
necessary to know which methods could be better suited to detect certain types
of non-Gaussianity, even if they are tested on toy model simulations.

We have implemented goodness-of-fit statistics developed to detect deviations
from a given distribution, in our case from the Gaussian one. Three different
methods have been tested against simulations including deviations from the
Gaussian distribution. The selected methods were developed by Shapiro &
Francia (1972), Thomas & Pierce (1979) and Rayner & Best (1990). The last two
belong to the so-called smooth goodness-of-fit tests. The performance of these
methods was checked on simulations based on the Edgeworth expansion including
distortions produced by the presence of cumulants of order higher than two. A
strong conclusion cannot be drawn from this exercise. If only skewness and
kurtosis are present, the statistic $W_2$ developed by Thomas & Pierce (1979)
has more power than the rest of the applied statistics. The presence of
cumulants of order 5 and 6 seems to be better detected by the Shapiro-Francia
test.

Several statistical methods have already been applied to the MAXIMA data in
the search for non-Gaussianity. Wu et al. (2001) calculated moments,
cumulants, Minkowski functionals, Kolmogorov and $\chi^2$ tests of the real
and Wiener filtered data as well as of the eigenmodes and signal-whitened
data. Santos et al. (2002) obtained the bispectrum value for these data. In
both cases comparison with Gaussian predictions confirmed the compatibility of
the MAXIMA data with Gaussianity. We have added three more tests to the ones
applied to MAXIMA (see also Aliaga et al. (2003), where constraints on $S$ and
$K$ are imposed based on those data). The tests are optimal in the case of
non-correlated data. Moreover, they can be applied even in cases in which the
observations are not taken on a regular grid. As a first step in our method we
have decorrelated the observations by multiplying by the inverse of the
corresponding Cholesky matrix. This is feasible in this case, in which only a
region of the whole sky was covered, as any matrix operation is
computationally very expensive. Future application of these methods could in
any case be done by decorrelating data region by region of the sky. For the
case analysed in this work, one can conclude that the MAXIMA data are
compatible with Gaussianity under the three goodness-of-fit methods applied.

Table 8. Probability of the MAXIMA data of being drawn from a Gaussian distribution (Prob >= MAXIMA). Simulations
consist of the sum of signal plus noise following the MAXIMA observational constraints. Both the data and the simulations are
decorrelated (as explained in the text) before they are analysed.

statistic  Prob%

S

K          48.33

k5         0^.00

k6         46.33

S3

S4         73.00

S5         57.67

S6

W1         34.00

W2         63.67

W3         76.67

W4         89.00

SF         27.00

(bef/aft). For the two cases considered, the non-Gaussian simulations are generated as explained in the text and based on Edgeworth
expansions with S = 0.5, K = 0.5 (column 2) and S = 0.7, K = 0.7 (column 3).

statistic  Power(95%) bef/aft, S = 0.5, K = 0.5  Power(95%) bef/aft, S = 0.7, K = 0.7